	*	     *
* 		*	    
			       
--- spaceTexts corpus v1.0 --- 
			      
    *	*	*
*	  *		*


This is the README file for the spaceTexts corpus v1.0.



-- Details:

This dataset contains digitized English translations of the records of speeches delivered during the General Debate/General Exchange of Views at the yearly meetings of the United Nations Committee on the Peaceful Uses of Outer Space (COPUOS).

The corpus currently contains 888 statements from 1961-1993, stored as plain text files. Each file name encodes the following metadata, separated by underscores: country/organization name (ISO-3 code), session number, year, state/nonstate actor indicator, and general statement/statement of reply indicator.

For example, "FRA_4_1961_S_G.txt" is France's (FRA) general debate speech (G) from 1961, the year of the 4th session of COPUOS. The "S" marks France as a state; a nonstate actor, such as a commercial or non-governmental organization, is marked "NS" instead. The corresponding codes and speakers by year are available in the "Speaker List.csv" file.
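
The naming scheme above can be split back into its fields with a few lines of Python. This is only a minimal sketch, not part of the corpus itself: the names parse_filename and StatementMeta are hypothetical, and the pattern assumes every file follows the same underscore-separated layout as the example. The last field is returned as the raw code, since only "G" is documented here.

```python
import re
from typing import NamedTuple, Optional

# Layout assumed from the example "FRA_4_1961_S_G.txt":
# code_session_year_actor_kind.txt
_NAME_RE = re.compile(
    r"^(?P<code>[^_]+)_(?P<session>\d+)_(?P<year>\d{4})"
    r"_(?P<actor>NS|S)_(?P<kind>[^_]+)\.txt$"
)


class StatementMeta(NamedTuple):
    """Metadata fields carried by one file name (hypothetical helper type)."""
    code: str       # country/organization code, e.g. "FRA"
    session: int    # COPUOS session number
    year: int       # year of the statement
    is_state: bool  # True for "S", False for "NS"
    kind: str       # raw statement-type code, e.g. "G" for general debate


def parse_filename(name: str) -> Optional[StatementMeta]:
    """Return the metadata for a file name, or None if it does not match."""
    m = _NAME_RE.match(name)
    if m is None:
        return None
    return StatementMeta(
        code=m["code"],
        session=int(m["session"]),
        year=int(m["year"]),
        is_state=(m["actor"] == "S"),
        kind=m["kind"],
    )
```

Calling parse_filename("FRA_4_1961_S_G.txt") gives code "FRA", session 4, year 1961, and is_state True. Names that do not fit the pattern, such as "Speaker List.csv", return None, so the function can be used to filter a directory listing.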

Further, the "Raw Transcripts" folder contains the original images of the speech transcripts. For more on UN COPUOS, see http://www.unoosa.org/oosa/en/ourwork/copuos/index.html 




-- Publication:

If you use the dataset in published work, please cite the following paper, which provides a fuller description and exploratory analyses:

Pomeroy, Caleb (2017) "spaceTexts: A New Corpus of Speeches in the UN Committee on the Peaceful Uses of Outer Space," In the 2017 International Conference on the Frontiers and Advances in Data Science (FADS), pp. 41-46, IEEE. DOI: 10.1109/FADS.2017.8253191

If referring to individual raw transcripts, please also consult the United Nations citation guidelines.



-- Notes / Steps forward:

- In 1966, 1968, 1970, and 1973 there was no formal General Debate section; speakers instead made their general remarks under the agenda item on the committee's report to the General Assembly.

- In 1974, parts of the US and French speeches appear to be missing from the transcripts. This might also be the case for the US, Egyptian, and French speeches in 1975.

- I'm in search of the transcripts for the years 1958-1960, 1971, 1986, 1987, and 1994-present. 



-- Acknowledgements:

Funding for v1.0 of the project is gratefully acknowledged from the Battelle Center for Science and Technology Policy at The Ohio State University's John Glenn College of Public Affairs. Thanks to the staff at the UN Office for Outer Space Affairs and New York University's Bobst Library, and to Slava Mikhaylov, who has a similar project for the General Debate speeches at the UN General Assembly level (DOI: 10.7910/DVN/0TJX8Y).





